Regression Methodology Based Disclosure of a Statistical Database

Author

  • Michael A Palley
Abstract

A statistical database serves two major purposes: to provide the statistician with accurate aggregate statistical information, and to protect the confidentiality of individual database records. A technique is presented which utilizes regression methodology to compromise confidential information in a statistical database. In the case that a database management system precludes application of regression methodology, the research introduces the notion of a "synthetic database", created through legitimate means, which circumvents this control and once again permits disclosure through regression methodology. The approach is validated on various subsets of the 1980 U.S. Census microdata for the State of New York. Finally, the regression methodology approach is examined in its ability to cause disclosure even where various existing confidentiality protection measures are...
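The idea of a "synthetic database" followed by regression can be illustrated with a minimal sketch. This is not the paper's procedure, which the truncated abstract does not spell out. Everything here is an assumption made for illustration: the toy `age`/`salary` records, the hypothetical `aggregate_query` interface that returns only a count and a mean per age range, the minimum query-set size of 5, and the choice of replicating each cell's midpoint and mean to form synthetic records.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def make_private_records(n, seed=0):
    """Toy confidential data (hypothetical): salary rises linearly with age plus noise."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        age = rng.randint(20, 59)
        salary = 15000 + 800 * age + rng.gauss(0, 3000)
        records.append((age, salary))
    return records


def aggregate_query(records, lo, hi, min_size=5):
    """Hypothetical statistical interface: count and mean salary for lo <= age < hi.

    Individual records are never returned, and small query sets are refused.
    """
    cell = [salary for age, salary in records if lo <= age < hi]
    if len(cell) < min_size:
        return None
    return len(cell), sum(cell) / len(cell)


def build_synthetic(records, width=5):
    """Build synthetic records using only answers to permitted aggregate queries."""
    synthetic = []
    for lo in range(20, 60, width):
        answer = aggregate_query(records, lo, lo + width)
        if answer is None:
            continue
        count, mean_salary = answer
        midpoint = lo + (width - 1) / 2  # centre of the integer ages lo .. lo+width-1
        synthetic.extend([(midpoint, mean_salary)] * count)
    return synthetic


private = make_private_records(500)
synthetic = build_synthetic(private)
a, b = fit_line([x for x, _ in synthetic], [y for _, y in synthetic])


def estimate_salary(age):
    """Regression estimate of the confidential salary for a person of known age."""
    return a + b * age
```

In this toy setting, a user who knows only a target individual's age can call `estimate_salary(age)` to get an estimate of the confidential attribute without ever having queried an individual record.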

Similar resources

Logistic Regression with Variables Subject to Post Randomization Method

An increase in quality and detail of publicly available databases increases the risk of disclosure of sensitive personal information contained in such databases. The goal of Statistical Disclosure Control (SDC) is to develop methodology that aims at minimizing disclosure risk while providing society with as much information as possible needed for valid statistical inference. The Post Randomizat...

A Quantitative Examination of Firm Factors Influencing Occupational Health and Safety Disclosures in Annual Reports

This paper uses binary logistic regression to develop two models of firms’ Occupational Health and Safety disclosures, one based on disclosure / non-disclosure, the other based on above / below the median levels of disclosure. Industry and auditor are found to be important components of both models, whilst operating revenue contributes to the former and company age to the latter. These findings...

Noise Multiplication for Statistical Disclosure Control of Extreme Values in Log-normal Regression Samples

Statistical agencies must control disclosure risk when releasing data to the public. If income data on individuals or businesses are released, it could be possible to match extremely large values to specific individuals or businesses that are known to be wealthy, especially if some additional information is available on the same units in the dataset. The purpose of the present investigation is ...

A Method for Protecting Access Pattern in Outsourced Data

Protecting the information access pattern, which means preventing the disclosure of data and structural details of databases, is very important in working with data, especially in the cases of outsourced databases and databases with Internet access. The protection of the information access pattern indicates that mere data confidentiality is not sufficient and the privacy of queries and accesses...

The Role of Managers Ownership in the Relationship between the Disclosure of Corporate Ethical & Social Responsibility & Cost of Equity

Background: Accurate and timely disclosure of information is one of the most important tools for company managers to reduce the cost of capital. The purpose of this study is to investigate the role of managers' ownership in the relationship between the disclosure of corporate social and moral responsibility and the cost of equity in companies listed on the Tehran stock exchange. Method...

A posteriori Disclosure Risk Measure for Tabular Data Based on Conditional Entropy∗

Statistical database protection, also known as Statistical Disclosure Control (SDC), is a part of information security which tries to prevent published statistical information (tables, individual records) from disclosing the contribution of specific respondents. This paper deals with the assessment of the disclosure risk associated to the release of tabular data. So-called sensitivity rules are...

Publication date: 2002